Astronomy & Astrophysics manuscript no. reply 


February 5, 2008 


(DOI: will be inserted by hand later) 








Some comments on the note
"Some comments on the paper 'Filter design for the detection of compact sources
based on the Neyman-Pearson detector' by M. López-Caniego et al.
(2005, MNRAS 359, 993)" by R. Vio and P. Andreani (astro-ph/0509394)

M. López-Caniego 1,2, D. Herranz 1, R.B. Barreiro 1, and J.L. Sanz 1

1 Instituto de Física de Cantabria, CSIC-UC, Av. los Castros s/n, 39005 Santander, Spain
  e-mail: caniego@ifca.unican.es
2 Departamento de Física Moderna, Universidad de Cantabria, Facultad de Ciencias, Av. los Castros s/n, 39005
  Santander, Spain

Received ; accepted 



Abstract. In this note we stress the necessity of a careful check of the arguments used by Vio & Andreani (2005)
(VA hereinafter) to criticise the superior performance of the biparametric scale adaptive filter (BSAF) with respect
to the classic matched filter (MF) in the detection of sources on a random Gaussian background. In particular, we
point out that a defective reading and understanding of previous works in the literature (Rice 1954; Barreiro et al.
2003; López-Caniego et al. 2005) leads the authors of VA to the derivation of an incorrect formula and to some
misleading conclusions.

Send offprint requests to: M. López-Caniego

1. Introduction

In recent times, some controversy has arisen about the design of "optimal" filters for the detection of sources
embedded in a noisy background. The controversy seems to be focused on the following question: do we already
have an optimal tool for detecting such sources, or is it worth trying to find better methods for the task?

In López-Caniego et al. (2005) and some previous works (Barreiro et al. 2003; López-Caniego et al. 2004, 2005)
we have explored the detection problem in the context of astronomical data mining. The motivation of our work
has been the need to detect extragalactic objects, often referred to as "point sources" due to their small
angular size, in microwave Astronomy. Since the number of these objects increases very quickly as their flux
decreases, even a small improvement in our ability to notice faint extragalactic objects can lead to a
significant rise in the number of detections. Hence the importance of working out new and more powerful
detection procedures.

López-Caniego et al. (2005) have proposed a detection procedure based on a common practice in Astronomy, which
consists in identifying possible sources through the presence of "peaks" in the data. Commonly, the data are
previously filtered in order to improve the detectability of the sources. Then, some decision rule is applied to
the peaks in order to determine whether they correspond to sources or not.

A typical decision rule is based on the idea of amplitude thresholding, that is, the hypothesis that a source
is present at any considered point is accepted if the amplitude of the observed data at that point is higher
than a certain value. A decision rule based only on the amplitude at the point where the decision has to be
made is missing information on the local structure of the source and the background where it is embedded. Thus,
in order to increase the power of the decision rule, in López-Caniego et al. (2005) we considered not only the
amplitude of the peaks, but also the curvature.
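
The difference between the two kinds of decision rule can be sketched in a few lines of Python. The fragment
below is only an illustration under stated assumptions, not the procedure of López-Caniego et al. (2005): the
function name `find_peaks_1d`, the finite-difference estimate of the curvature and the toy data are all
hypothetical choices made here.

```python
import numpy as np


def find_peaks_1d(signal):
    """Locate strict local maxima of a 1D signal.

    Returns the peak indices, the amplitude at each peak and the curvature,
    estimated as minus the second finite difference (positive at a maximum).
    """
    signal = np.asarray(signal, dtype=float)
    interior = np.arange(1, signal.size - 1)
    is_peak = (signal[interior] > signal[interior - 1]) & (
        signal[interior] > signal[interior + 1]
    )
    idx = interior[is_peak]
    amplitude = signal[idx]
    curvature = -(signal[idx + 1] - 2.0 * signal[idx] + signal[idx - 1])
    return idx, amplitude, curvature


# Toy filtered data (hypothetical values): a sharp peak and a broad one.
data = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 2.5, 2.8, 2.5, 0.0])
idx, amplitude, curvature = find_peaks_1d(data)

# Amplitude thresholding uses only the first quantity ...
accepted_by_amplitude = idx[amplitude > 2.0]
# ... whereas a rule on (amplitude, curvature) can also use the peak shape.
```

In this toy case both peaks pass the amplitude threshold, although their curvatures are very different; that
extra information is what a rule based on amplitude and curvature can exploit.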

Following this idea, in López-Caniego et al. (2005) we considered as decision rule the Neyman-Pearson detector,
which gives the highest number of detections for a given number of false alarms. Since the Neyman-Pearson
detector was to be applied to local maxima (peaks), we first derived the expressions for the number density of
peaks in an interval $(x, x + dx)$ with amplitude in $(\nu, \nu + d\nu)$ and curvature in
$(\kappa, \kappa + d\kappa)$, both in the presence and in the absence of a source. These number densities depend
on the properties of the background, and these properties can be modified, up to a certain extent, through
filtering. Then, we explored the performance of a linear filter, the Biparametric Scale-Adaptive Filter (BSAF),
which depends on a small number of parameters that can be chosen so that the performance of the Neyman-Pearson
detector is optimised. We found that the BSAF outperforms other filters such as the Mexican Hat Wavelet and the
standard matched filter (MF) in some interesting cases.

Very recently, Vio & Andreani (2005) have criticised the work presented in López-Caniego et al. (2005). We feel
compelled to warn the community against some of their remarks, which turn out to be either the result of a
misinterpretation of our work or simply wrong.

2. Some comments on the comments by VA 

In the introduction of their note, VA reproduce some very well-known results about the matched filter in the
context of the Neyman-Pearson theorem, which can be found in any basic signal detection textbook (for example
Wainstein & Zubakov 1962). Though this is unquestionably correct, only the amplitude of the signal is
considered and, besides, it is still necessary to provide a criterion to localise and define a single source
among the set of pixels above the threshold. We have followed a different approach that incorporates the
identification of any single source through the presence of a local maximum, together with information about
the curvature. Hence, our work is not in contradiction with the scheme reproduced by VA, because we are
following a different, more complete, approach.

In addition, VA make three comments about our approach. In the first comment, VA point out that one of our
equations (equation (8) in López-Caniego et al. 2005) is not correct. However, this statement is not true and
the alternative equation they propose is actually wrong.

In our work, we construct the Neyman-Pearson detector using the number density of maxima of the background
and the same number density in the presence of background and source. Rice (1954) obtains the number density of
maxima in an interval $(x, x + dx)$ with amplitude in $(\nu, \nu + d\nu)$ and curvature in
$(\kappa, \kappa + d\kappa)$ for a Gaussian background as:



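$$
n_b(\nu, \kappa) = \frac{n_b\,\kappa}{\sqrt{2\pi(1-\rho^2)}}
\exp\left[-\frac{\nu^2 + \kappa^2 - 2\rho\nu\kappa}{2(1-\rho^2)}\right]
\tag{1}
$$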



where $\nu \in (-\infty, +\infty)$ and $\kappa \in (0, +\infty)$. Note that the probability density
$p_b(\nu, \kappa)$ is straightforwardly obtained by dividing the previous equation by the total number
density $n_b$.
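
As a quick numerical sanity check, the following Python sketch integrates $p_b(\nu, \kappa)$ over the stated
domain. It assumes the form of equation (1) with prefactor $\kappa / \sqrt{2\pi(1-\rho^2)}$; the function
names, the grid step and the values of $\rho$ are illustrative choices made here, not taken from the original
papers.

```python
import numpy as np


def p_b(nu, kappa, rho):
    """Equation (1) divided by n_b: density of maxima in (nu, kappa)."""
    norm = np.sqrt(2.0 * np.pi * (1.0 - rho**2))
    quad = nu**2 + kappa**2 - 2.0 * rho * nu * kappa
    return kappa / norm * np.exp(-quad / (2.0 * (1.0 - rho**2)))


def integrate_p_b(rho, step=0.02, nu_max=10.0, kappa_max=10.0):
    """Riemann sum of p_b over nu in (-nu_max, nu_max), kappa in (0, kappa_max)."""
    nu = np.arange(-nu_max, nu_max + step, step)
    kappa = np.arange(0.0, kappa_max + step, step)
    nu_grid, kappa_grid = np.meshgrid(nu, kappa, indexing="ij")
    return float(np.sum(p_b(nu_grid, kappa_grid, rho)) * step * step)


# Expected to be close to 1 if equation (1) / n_b is a normalised density.
total = integrate_p_b(0.5)
```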

To obtain the probability density in the presence of a point source of amplitude $\nu_s$ and curvature
$\kappa_s$, VA simply substitute $\nu \to \nu - \nu_s$ and $\kappa \to \kappa - \kappa_s$ in
$p_b(\nu, \kappa)$:



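$$
p(\nu, \kappa \,|\, \nu_s) = \frac{\kappa - \kappa_s}{\sqrt{2\pi(1-\rho^2)}}
\times \exp\left[-\frac{(\nu - \nu_s)^2 + (\kappa - \kappa_s)^2
 - 2\rho(\nu - \nu_s)(\kappa - \kappa_s)}{2(1-\rho^2)}\right]
\tag{2}
$$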



and indicate that $\nu \in (-\infty, +\infty)$ and $\kappa \in (\kappa_s, +\infty)$. However, this equation
cannot be derived in such a simple way. First of all, one needs to construct the joint probability density of
the field, its first derivative and its second derivative (where terms of the form $\nu - \nu_s$ and
$\kappa - \kappa_s$ appear). From this joint probability, one follows the procedure explained in Rice (1954),
Bardeen et al. (1986) and Bond & Efstathiou (1987), obtaining the number density of maxima in the intervals
$(x, x + dx)$, $(\nu, \nu + d\nu)$ and $(\kappa, \kappa + d\kappa)$:



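$$
n(\nu, \kappa \,|\, \nu_s) = \frac{n_b\,\kappa}{\sqrt{2\pi(1-\rho^2)}}
\times \exp\left[-\frac{(\nu - \nu_s)^2 + (\kappa - \kappa_s)^2
 - 2\rho(\nu - \nu_s)(\kappa - \kappa_s)}{2(1-\rho^2)}\right]
\tag{3}
$$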



where $\nu \in (-\infty, +\infty)$ and $\kappa \in (0, +\infty)$. Note that the factor $\kappa$ that multiplies
the exponential comes from imposing the condition of having a maximum, and it refers to the total curvature
given by the background plus source (not only to the background, as stated by VA). We would like to remark that
equation (3) gives the number density of maxima coming from the combination of the background plus source. This
does not mean, at all, that the maximum of the source has to coincide with a maximum of the noise process, as
stated by VA. In addition, VA claim that $\kappa \in (\kappa_s, +\infty)$, since they wrongly assume that the
maximum of the global field has to coincide with a maximum of the background. However, this is not true, and
therefore there is no reason to restrict $\kappa$ to such an interval. In fact, $\kappa$ can take any value in
$(0, +\infty)$. Note that this is another indication that equation (2) proposed by VA is wrong, since this
probability can take negative values when considering the correct interval for $\kappa$.

Regarding the second comment of VA, they criticise the fact that we work on a filtered version of the original
signal. We would like to stress that the common procedure in astronomy (and other fields) for object detection
is to filter the original image in order to enhance the sources and then detect and identify those sources.
Thus, an important issue is not only to find the optimal filter, but also to decide which criterion is used to
identify the sources. In our approach, we a priori identify the maxima of the filtered image as source
candidates. Then, we apply a Neyman-Pearson detector to decide whether or not the maximum is due to the presence
of a point source. Note that, since we are considering the identification through the idea of maxima, it is
natural to define the Neyman-Pearson detector in terms of the number density of maxima. Taking into account
these ideas, we explore different filters and find that the BSAF outperforms the other filters (including the
MF) in some cases. Furthermore, VA suggest that we are trying to find an approximate solution to the decision
problem based on the likelihood ratio



$$
L(x, \nu, \kappa) = \frac{p(x, \nu, \kappa \,|\, H_1)}{p(x, \nu, \kappa \,|\, H_0)} > \gamma
\tag{4}
$$

It is not clear to us what VA mean by this notation. If $x$ is the observed 1-dimensional signal (as VA defined
in their introduction), $\nu$ is redundant. Also, if $\kappa$ refers to the whole image, it should be a vector,
$\boldsymbol{\kappa}$. We understand that what VA mean is to construct a likelihood ratio using the amplitude
and curvature of all the pixels in the image (in fact, if using all the pixels, one should also introduce the
information on the first derivative). However, this procedure does not make sense in our approach, since we are
considering only the maxima of the image. Note that in the approach suggested by VA, one would also need, a
posteriori, a criterion to identify which points of the image (from those above the threshold $\gamma$)
correspond to each source. In addition, VA claim that our conclusions are drawn only on the basis of numerical
experiments. However, most of López-Caniego et al. (2005) is devoted to presenting the theoretical framework of
our method. Simulations are then performed in order to test the theoretical results. Therefore, the criticisms
of VA in their second comment are not well founded.
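
For reference, the ratio on which a Neyman-Pearson detector defined on the maxima is based can be written out
from equations (1) and (3). The expression below is a sketch that assumes a single, known source amplitude
$\nu_s$ (with curvature $\kappa_s$) and no prior on it; the symbol $L(\nu, \kappa \,|\, \nu_s)$ is introduced
here only for illustration.

```latex
\begin{equation*}
L(\nu,\kappa\,|\,\nu_s) \equiv \frac{n(\nu,\kappa\,|\,\nu_s)}{n_b(\nu,\kappa)}
 = \exp\left[\frac{(\nu_s-\rho\kappa_s)\,\nu + (\kappa_s-\rho\nu_s)\,\kappa}{1-\rho^2}
 - \frac{\nu_s^2+\kappa_s^2-2\rho\,\nu_s\kappa_s}{2(1-\rho^2)}\right]
\end{equation*}
```

Under these assumptions the common prefactor cancels, so the condition that this ratio exceeds a constant is a
linear cut in the $(\nu, \kappa)$ plane, which uses the curvature of each maximum as well as its amplitude.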

The third comment of VA refers to the procedure followed in the numerical experiments. They criticise that we
consider only those sources whose peak is not moved to another pixel. We would like to remark that the aim of
our work was to present a novel theoretical framework for object detection and to test it with numerical
simulations. Therefore, we try to reproduce exactly the theoretical scheme with our simulations and focus only
on what happens in one pixel of the image, the pixel in which the source is located. In fact, this point was
already discussed in Barreiro et al. (2003), finding that the different filters considered there lead to a
similar number of detections in the neighbouring pixels of the source and, thus, it did not affect the
conclusions. In any case, in the more realistic situation in which all the pixels of the image are considered,
the conclusion that the BSAF detects more sources than the other filters at the correct location remains true.

Finally, VA comment in their conclusions that the performance of our filter is based on strong a priori
assumptions such as the Gaussianity of the background and the symmetry of the source profile. We would like to
remark that many real fields do follow a Gaussian distribution and therefore this is a very common and realistic
assumption. In fact, in their introduction, VA also assume the Gaussianity of the background to show that the
statistic given by the Neyman-Pearson detector (when only information about the amplitude is used) leads to the
MF. Regarding the symmetry of the source profile, the filters can be generalised without any difficulty to
non-symmetrical profiles.

3. Conclusions 

In a note that recently appeared on astro-ph, VA have made some comments about our work (López-Caniego et al.
2005). In this note, we have carefully checked their arguments. VA make three main comments, which we
summarise below.

In their first comment, VA have questioned an allegedly unproven formula in our work, which is in fact
rigorously derived from previous works in the literature (Rice 1954; Bardeen et al. 1986; Bond & Efstathiou
1987). Instead, VA have proposed an incorrect formula.

In their second comment, VA criticise the lack of generality of the approach proposed in López-Caniego et al.
(2005). In particular, VA criticise the idea of filtering the data and applying the Neyman-Pearson detector to
the local maxima. Instead, they suggest that a generalisation of the derivation of the Neyman-Pearson detector,
including not only amplitudes but also the second derivatives of the field, should be done on a purely
theoretical basis. Nevertheless, they are not able to provide such a theoretical derivation, and the likelihood
ratio they propose is not general either, since it does not include the first derivative of the field, which is
not zero away from the maxima. Our approach, however, is consistent and leads to an improvement in the number
of detections.

In their third comment, VA criticise a set of numerical experiments designed to test our theoretical arguments
precisely for being designed to test our theoretical arguments. They suggest instead performing numerical
experiments to test what the theory does not say. We have derived the number densities of maxima in two cases:
when a source is located at the position of the maximum (not "nearby the maxima") and when there is no source.
The way to test the hypothesis expressed by these formulae is exactly the one explained in López-Caniego et al.
(2005).

Besides the three main comments mentioned above, VA made a few others. One of them is the claim that our
conclusions are drawn only on the basis of numerical experiments, which is plainly false. In López-Caniego et
al. (2005) we give a theoretical foundation for our method, we make predictions based on the theory and then we
check those predictions with numerical simulations. The agreement is excellent.

Another objection is that our proposed method seems rather complicated. Though it is true that simplicity is an
aesthetically admirable quality, we feel that a little complexity should not scare scientists in their work.
As mentioned in the introduction of this note, it is worth working hard to improve the detection capability of
our statistical methods, even if the improvement is only a few percent, because it may lead to a significant
rise in the number of extragalactic objects detected.

Finally, VA blame our method for making two stringent a priori assumptions: symmetry of the source profile and
Gaussianity of the background. It is false that our method requires symmetry of the source: it was assumed only
for simplicity, but the filters can be generalised to non-symmetric profiles just as the standard matched
filter can. Regarding Gaussianity, VA make in their introduction exactly the same assumption as we do.